Introduction

This analysis identifies variables that differ significantly across sample groups and plots them as a heatmap.

Description

Project

This is a demo.

Analysis

Cluster co-expressed genes across 9 NCI60 tumor types

Go to project home

Summary statistics

Table 1. Distribution across all variables of the per-variable mean, standard deviation (SD), and range.

Statistic   Min.    1st Qu.   Median   Mean     3rd Qu.   Max.
Mean        2.274   4.3560    5.6260   5.6820   6.8810    12.160
SD          0.500   0.5625    0.6536   0.7684   0.8506    3.108
Range       2.010   2.7500    3.2900   3.6180   4.2000    8.840
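
A table of this shape can be produced by computing the mean, SD, and range of each variable and then summarizing each statistic across variables. The following is a minimal sketch on simulated data, not the report's own code: the matrix `expr` (variables in rows, samples in columns), its dimensions, and the reading of "range" as maximum minus minimum are all assumptions.

```r
set.seed(1)
# Hypothetical data: 500 variables (rows) x 54 samples (columns)
expr <- matrix(rnorm(500 * 54, mean = 6, sd = 1), nrow = 500)

# Per-variable statistics, one row per variable
stats <- cbind(
  Mean  = rowMeans(expr),
  SD    = apply(expr, 1, sd),
  Range = apply(expr, 1, function(v) diff(range(v)))  # assumed: max minus min
)

# Six-number summary of each statistic across variables (one row per statistic)
tbl <- t(apply(stats, 2, summary))
print(round(tbl, 3))
```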

Variable selection

Run ANOVA

A one-way ANOVA was run on each variable to identify those that differ significantly across all sample groups.

Figure 1. Distribution of ANOVA p values. Number of variables with p values within each 0.01 interval.
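
The per-variable test and the p value histogram can be sketched in base R as follows. This is an illustration on simulated data rather than the report's own code; the matrix `expr`, the factor `group`, the group sizes, and the artificial shift added to the first 20 variables are assumptions.

```r
set.seed(1)
# Hypothetical data: 200 variables (rows), 9 groups of 6 samples (columns)
group <- factor(rep(paste0("G", 1:9), each = 6))
expr  <- matrix(rnorm(200 * length(group)), nrow = 200)
# Shift the first 20 variables in one group so that some true differences exist
expr[1:20, group == "G1"] <- expr[1:20, group == "G1"] + 3

# One-way ANOVA per variable; keep the p value of the F test for the group term
pval <- apply(expr, 1, function(v) {
  summary(aov(v ~ group))[[1]][["Pr(>F)"]][1]
})

# Count variables in each 0.01-wide p value interval
h <- hist(pval, breaks = seq(0, 1, length.out = 101), plot = FALSE)
plot(h, main = "Distribution of ANOVA p values", xlab = "p value")
```

For large data sets, a vectorized F test would be faster than calling `aov` once per row; the loop form is used here only because it mirrors the description most directly.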

Select variables

Significant variables were selected using the following criteria:

  • Select variables with ANOVA p values less than 10^{-5}
  • Stop if the number of remaining variables is between 100 and 2000; otherwise
    • if fewer than 100 variables remain, select the 100 variables with the smallest p values
    • if more than 2000 variables remain, select the 2000 variables with the smallest p values

As a result, 396 variables were selected.
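
The selection criteria above amount to clamping the number of selected variables between 100 and 2000. A minimal sketch follows, assuming a numeric vector of p values; the helper `select_variables` and its argument names are hypothetical and not part of RoCA.

```r
# Hypothetical helper (not part of RoCA): returns the indices of the selected
# variables, ordered from smallest to largest p value
select_variables <- function(pval, cutoff = 1e-5, min_n = 100, max_n = 2000) {
  ranked <- order(pval)                       # smallest p value first, NAs last
  n_pass <- sum(pval < cutoff, na.rm = TRUE)  # variables passing the cutoff
  # Clamp the count to [min_n, max_n] without exceeding the variables available
  n_keep <- min(max(n_pass, min_n), max_n, length(pval))
  ranked[seq_len(n_keep)]
}

# Only 30 variables pass the cutoff, so the top 100 are returned instead
pval <- c(rep(1e-8, 30), seq(0.001, 1, length.out = 5000))
length(select_variables(pval))  # 100
```

When the number passing the cutoff already lies between 100 and 2000, the result is exactly the set of variables with p values below the cutoff.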


Heatmap

Figure 2. Color-coded values of the selected variables that differ across sample groups (red = higher). Variables (rows) were clustered by their correlation with one another, and samples were arranged by group.
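
A heatmap with correlation-clustered rows and group-ordered samples can be sketched with base R graphics as shown below. This is an illustration under stated assumptions, not the report's plotting code: the simulated matrix `expr`, the factor `group`, average linkage, row scaling, and the blue-white-red palette are all choices made for the example. The session information lists gplots, but base `heatmap` is used here to avoid extra dependencies.

```r
set.seed(1)
# Hypothetical data: 40 selected variables (rows), 3 groups of 5 samples (columns)
group <- factor(rep(c("A", "B", "C"), each = 5))
expr  <- matrix(rnorm(40 * length(group)), nrow = 40,
                dimnames = list(paste0("var", 1:40),
                                paste0("s", seq_along(group))))

# Arrange samples by group; columns are not clustered
expr <- expr[, order(group)]

# Cluster variables using 1 - Pearson correlation as the distance
row_dist <- as.dist(1 - cor(t(expr)))
row_tree <- hclust(row_dist, method = "average")  # linkage is an assumption

# Palette running from blue (lower) to red (higher)
pal <- colorRampPalette(c("blue", "white", "red"))(64)
heatmap(expr, Rowv = as.dendrogram(row_tree), Colv = NA,
        scale = "row", col = pal)
```

Setting `Colv = NA` keeps the columns in the supplied order, which is what preserves the grouping of samples.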


Appendix

Check out the RoCA home page for more information.

Reproduce this report

To reproduce this report:

  1. Find the data analysis template you want to use and the example of its paired YAML file, then download the YAML example to your working directory.

  2. To generate a new report using your own input data and parameters, edit the following items in the YAML file:

    • output : where you want to put the output files
    • home : the URL if you have a home page for your project
    • analyst : your name
    • description : background information about your project, analysis, etc.
    • input : where your input data are located; read the instructions for preparing them
    • parameter : parameters for this analysis; read the instructions on how to prepare the input data

  3. Run the code below in the R console or RStudio, preferably in a new R session:

# Install (if missing) and load the packages needed to build the report
if (!require(devtools)) { install.packages('devtools'); library(devtools) }
if (!require(RCurl)) { install.packages('RCurl'); library(RCurl) }
if (!require(RoCA)) { install_github('zhezhangsh/RoCAR'); library(RoCA) }

# Placeholder: replace with the path to the YAML file you just downloaded and edited
filename.yaml <- 'filename.yaml'
CreateReport(filename.yaml)

If the code runs without errors, go to the output folder and open the index.html file to view the report.

Session information

## R version 3.2.2 (2015-08-14)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: OS X 10.10.5 (Yosemite)
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] gplots_3.0.1        htmlwidgets_0.8     DT_0.2             
##  [4] RoCA_0.0.0.9000     awsomics_0.0.0.9000 RCurl_1.95-4.8     
##  [7] bitops_1.0-6        devtools_1.13.2     yaml_2.1.13        
## [10] rmarkdown_1.3       knitr_1.14         
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.12       magrittr_1.5       stringr_1.2.0     
##  [4] highr_0.6          caTools_1.17.1     tools_3.2.2       
##  [7] KernSmooth_2.23-15 withr_1.0.2        htmltools_0.3.5   
## [10] gtools_3.5.0       rprojroot_1.2      digest_0.6.12     
## [13] formatR_1.4        memoise_1.1.0      evaluate_0.9      
## [16] gdata_2.17.0       stringi_1.1.1      backports_1.1.0   
## [19] jsonlite_1.0
